LASAFT-Net-v2 light version for MDX Challenge

Sample

!git clone https://github.com/ws-choi/LASAFT-Net-v2
%cd LASAFT-Net-v2
Cloning into 'LASAFT-Net-v2'...
remote: Enumerating objects: 245, done.
remote: Counting objects: 100% (245/245), done.
remote: Compressing objects: 100% (169/169), done.
remote: Total 245 (delta 86), reused 209 (delta 57), pack-reused 0
Receiving objects: 100% (245/245), 22.39 MiB | 1.81 MiB/s, done.
Resolving deltas: 100% (86/86), done.
/home/wschoi/LASAFT-Net-v2/book/quickstart/LASAFT-Net-v2
# preliminaries: install python3-dev
# quote the version specifier so the shell does not treat '>' as an output redirect
!pip install hydra-core==1.1.0.rc1 "pytorch-lightning>=1.4.1" rich wandb
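To confirm that the install step picked up the intended packages, the installed versions can be listed from Python. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of LASAFT-Net-v2.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


# Package names taken from the pip command above.
print(installed_versions(['hydra-core', 'pytorch-lightning', 'rich', 'wandb']))
```

A `None` entry means the corresponding package is not visible to the running kernel, which usually points to pip having installed into a different environment.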
import soundfile as sf
from IPython.display import display, Audio

mixture, _ = sf.read('data/test/sample/mixture.wav')

display(Audio(mixture.T, rate=44100))
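The cell above discards the sample rate returned by `sf.read` and passes a fixed `rate=44100` to `Audio`. A quick sanity check on the loaded array can catch a mismatch before separation. This is a small sketch under the assumption that the array has shape `(frames, channels)`, or `(frames,)` for mono, as `sf.read` returns by default; the helper name `describe_audio` is hypothetical.

```python
def describe_audio(shape, samplerate, expected_rate=44100):
    """Summarize an array shape from sf.read and flag a sample-rate mismatch."""
    n_frames = shape[0]
    # A 1-D array is mono; otherwise the second axis holds the channels.
    n_channels = shape[1] if len(shape) > 1 else 1
    return {
        'frames': n_frames,
        'channels': n_channels,
        'seconds': n_frames / samplerate,
        'rate_ok': samplerate == expected_rate,
    }


# Usage in the notebook (needs the cloned repository and soundfile):
# mixture, rate = sf.read('data/test/sample/mixture.wav')
# print(describe_audio(mixture.shape, rate))
print(describe_audio((88200, 2), 44100))
```

If `rate_ok` is false, playing the audio with `rate=44100` would change its pitch and duration, so the file's own rate should be passed to `Audio` instead.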
from load_pretrained import get_mdx_light_v2_699

model = get_mdx_light_v2_699()
checkpoint conf/pretrained/v2_light/epoch=669.ckpt is loaded: 
result = model.separate_tracks(mixture, ['vocals', 'drums', 'bass', 'other'], overlap_ratio=0.5, batch_size=4)

print('separated vocals:')
display(Audio(result['vocals'].T, rate=44100))

print('separated drums:')
display(Audio(result['drums'].T, rate=44100))

print('separated bass:')
display(Audio(result['bass'].T, rate=44100))

print('separated other:')
display(Audio(result['other'].T, rate=44100))
/home/wschoi/exit/envs/tutorial-environment/lib/python3.9/site-packages/torch/functional.py:471: UserWarning: stft will soon require the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release. (Triggered internally at  ../aten/src/ATen/native/SpectralOps.cpp:664.)
  return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
/home/wschoi/exit/envs/tutorial-environment/lib/python3.9/site-packages/torch/functional.py:545: UserWarning: istft will require a complex-valued input tensor in a future PyTorch release. Matching the output from stft with return_complex=True.  (Triggered internally at  ../aten/src/ATen/native/SpectralOps.cpp:817.)
  return _VF.istft(input, n_fft, hop_length, win_length, window, center,  # type: ignore[attr-defined]
separated vocals:
separated drums:
separated bass:
separated other:
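The separated tracks are only played back above. To keep them, each stem can be written to its own WAV file. The sketch below uses only the standard library so that it runs without the repository. It assumes each stem is a 2-D sequence of shape `(frames, channels)` with float samples in [-1, 1], which is how `result[...]` appears to be used above but is not confirmed by the source. The names `write_pcm16` and `save_stems` are hypothetical.

```python
import struct
import tempfile
import wave
from pathlib import Path


def write_pcm16(path, frames, rate=44100):
    """Write frames (sequences of per-channel floats in [-1, 1]) as 16-bit PCM."""
    frames = [list(frame) for frame in frames]
    n_channels = len(frames[0]) if frames else 1
    data = bytearray()
    for frame in frames:
        # Clip to [-1, 1], then scale to the signed 16-bit range.
        ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in frame]
        data += struct.pack('<%dh' % n_channels, *ints)
    with wave.open(str(path), 'wb') as wav:
        wav.setnchannels(n_channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(rate)
        wav.writeframes(bytes(data))


def save_stems(stems, out_dir, rate=44100):
    """Write every stem in the dict to <out_dir>/<name>.wav and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, frames in stems.items():
        paths[name] = out_dir / f'{name}.wav'
        write_pcm16(paths[name], frames, rate)
    return paths


# Tiny synthetic example: two stereo frames per stem.
demo = {'vocals': [(0.0, 0.5), (-0.5, 1.0)], 'drums': [(0.1, 0.1), (0.2, 0.2)]}
out = save_stems(demo, tempfile.mkdtemp())
print(sorted(out))
```

In the notebook itself, where `soundfile` is already imported, a call such as `sf.write('vocals.wav', result['vocals'], 44100)` does the same job in one step and is much faster for full-length tracks.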
%cd ..
%rm -rf LASAFT-Net-v2
/home/wschoi/LASAFT-Net-v2/book/quickstart